SAGEO, 2023 - Québec, Canada
07 Jun 2023
Python : un langage polyvalent, interprété et multi-paradigme
De plus en plus utilisé pour la data science
“GeoPandas is an open source project to make working with geospatial data in python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types. Geometric operations are performed by shapely. Geopandas further depends on fiona for file access and matplotlib for plotting.”
Format des GeoDataFrame
array NumPyExemple d’utilisation
Yes
No
Maybe
en utilisant la variable r, un point, et le nom de la variable R à récupérer
data.frame ⇄ DataFrame pandas, tableau numpy, etc.)import pandas as pd
import numpy as np
df2 = pd.DataFrame({
"emp_id": list(range(5)),
"emp_name": ["Rick", "Dan", "Michelle", "Ryan", "Jane"],
"salary": [623.3, 515.2, 611.0, 729.0, 843.25],
"start_date": pd.to_datetime(["2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11", "2015-03-27"]),
})
arr = np.array([[12, 47], [34, 90], [23, 19]])Utilisation des widgets Jupyter possible (seulement is utilisation de l’engin de rendu jupyter, pas avec knitr - i.e. l’inverse des htmlwidgets en R qui ne fonctionne que si knitr est utilisé)
Exemple :
reg_fp = './geom/regio_l.shp'
mun_fp = './geom/munic_s.shp'
lc_fp = '../../Téléchargements/T01_PROVINCE.tif'
lc_categories_fp = '../../Téléchargements/correspondance_raster_CL_COUV.dbf'
dst_epsg_code = 6623
regio_s = gpd.read_file(reg_fp).to_crs(epsg=dst_epsg_code)
munic_s = gpd.read_file(mun_fp).to_crs(epsg=dst_epsg_code)dst_crs = f'EPSG:{dst_epsg_code}'
with rio.open(lc_fp) as src:
transform, width, height = calculate_default_transform(
src.crs, dst_crs, src.width, src.height, *src.bounds)
kwargs = src.meta.copy()
kwargs.update({
'crs': dst_crs,
'transform': transform,
'width': width,
'height': height
})
# Read first band
band = src.read(1)
# Create the destination array
img = np.zeros_like(band)
# Reproject and store the result in the
# destination array
reproject(
band,
img,
src_transform=src.transform,
src_crs=src.crs,
dst_transform=transform,
dst_crs=dst_crs,
resampling=Resampling.nearest,
)
# Store the extent of our raster data (we will need it later for plotting)
left = transform[2]
top = transform[5]
right = left + transform[0] * width
bottom = top + transform[4] * height
extent = [left, right, bottom, top](array([[255, 255, 255, ..., 255, 255, 255],
[255, 255, 255, ..., 255, 255, 255],
[255, 255, 255, ..., 255, 255, 255],
...,
[255, 255, 255, ..., 255, 255, 255],
[255, 255, 255, ..., 255, 255, 255],
[255, 255, 255, ..., 255, 255, 255]], dtype=uint8), Affine(50.00000000000006, 0.0, -830584.8693000015,
0.0, -50.00000000000006, 1303502.274700009))
# Replace '255' values by '0'
img[img == 255] = 0
# Create an empty figure
fig, ax = plt.subplots(figsize=(16, 12))
# Plot the raster using the previously computed extent
show(img, ax=ax, extent=extent, cmap="Set3")
# Plot Québec municipality on top
extract.plot(ax=ax, color='red', edgecolor='red', linewidth=2)# Open the DBF file that contains the correspondences between the codes and the names of land use categories:
categories = pd.DataFrame(gpd.read_file(lc_categories_fp, encoding='utf-8')[['ID', 'CL_COUV', 'Descriptio']])
categories.head(11) ID CL_COUV Descriptio
0 1.0 010000 Surfaces artificielles
1 2.0 020000 Terres agricoles
2 3.0 060000 Milieux humides forestiers
3 4.0 070000 Milieux humides herbacés ou arbustifs
4 5.0 100000 Plans et cours d’eau intérieure
5 6.0 110101 Forêts de conifères à couvert fermé
6 7.0 110102 Forêts de feuillus à couvert fermé
7 8.0 110103 Forêts mixtes à couvert fermé
8 9.0 110200 Forêts à couvert ouvert
9 10.0 000000 Pas de données
10 11.0 000001 En attente de traitement
[{1: 85316, 2: 12475, 3: 2737, 4: 756, 5: 3494, 6: 3556, 7: 38023, 8: 38147, 9: 69, 10: 43}, {8: 4}]
# We convert the dict objects into Counter objects, which allows us to add them up by simply using the "+" operator.
from collections import Counter
stats = Counter(stats[0]) + Counter(stats[1])
stats = dict(stats) # Convert back to dict
stats{1: 85316, 2: 12475, 3: 2737, 4: 756, 5: 3494, 6: 3556, 7: 38023, 8: 38151, 9: 69, 10: 43}
# Use the category names from the `categories` DataFrame
x = [
categories[categories.ID == int(v)]['Descriptio'].iloc[0]
for v in stats.keys()
]
# and the values we just computed with the `zonal_stats` function
height = stats.values()
fig, ax = plt.subplots(figsize=(8, 5))
bar_container = ax.bar(x, height)
ax.set_xticklabels(
ax.get_xticklabels(),
rotation=45,
horizontalalignment='right'
)
ax